Comments about the article in Nature: Machine learning comes up against unsolvable problem

Following is a discussion about this article in Nature Vol 565 17 January 2019, by Davide Castelvecchi
To study the full text select this link: https://www.nature.com/articles/d41586-019-00083-3


1. Introduction

The mathematicians, who were working on a machine-learning problem, show that the question of 'learnability' - whether an algorithm can extract a pattern from limited data - is linked to a paradox known as the continuum hypothesis.
This sentence is not clear. The sentence does not contain enough information. Adding: 'whether or not an algorithm etc' makes the whole sentence clearer.
Gödel showed that this cannot be proved either true or false using standard mathematical language.
Also this sentence does not contain enough information.

1. Not all sets are equal

Researchers often define learnability in whether an algorithm can generalize its knowledge.
I have to repeat again: Also this sentence does not contain enough information.
The algorithm is given the answer a 'yes or no' question - such as "Does this image show a cat" - for a limited number objects, and then has to guess the answer for new objects.
Specific see: Reflection 2 - Image selection experiment
Yehudayoff and his collaborators arrived at their result while investigating the connection between learnability and 'compression' which involves finding a way to summarize the salient features of a large of a large data set in a smaller set of data.
Here the problem is attacked more from a mathematical point of view. In real the problem is more physical.
The authors discovered that the information's ability to be compressed efficiently boils down to a question in the theory of sets - mathematical collections of objects such as the sets in Venn diagrams.
In theory this is correct. Image selection in some sense belongs to the mathematical problem of to which set (part of a Venn diagram) a specific image belongs. In practice this is very complex. First you have to unravel which are the parameters of each set.
For example which are the parameters that define a cat and which that define a dog. That is a physical problem.
In particular it relates to the different sizes of sets containing infinitely many objects.
It does not relate to the number of pictures (of cats) that there are in each set. The problem is the parameters that define the differences between the pictures.
George Cantor etc demontrated etc that not all infinite sets are created equal: in particular the set of integer numbers is smaller than the set of all real numbers, also known as the continuum.
The second part is true, but the first part is not clear.
However the second part has nothing to do with learnability i.e. to answer the question: "Does this image show a cat"
Their efforts were in vain. A 1940 result by Gödel ( wwhich was completed in the 1960 by Paul Cohen) showed that the continuum hypothesis canot be proved either true or false etc
Okay.
Gödel and Cohen's work on the continuum hypothesis implies that there can exist parallel mathematical universes that are both compatible with standard mathematics - one in which the continuum hypothesis is added to the standard axioms and therefore declared to be true and another in which it is declared false.
That solution is called "The egg of Columbus".
In one universe everyone is satisfied. In the other universe no one is satisfied.

2. Learnability Limbo

In this way, they show that the problem of learnability is equivalent to the continuum hypothesis.
Therefore, the learnability problem, too, is in a state of limbo that can be resolved only by choosing the axiomatic universe.
In order to solve the learnability problem you first have to define what you mean.
The issue is first what do we mean with learning?
If learning involves to answer the question does this picture show a cat or a dog, than you first have to train the person which different pictures which either show a cat or a dog such that he can learn by observing what the differences are. In a second phase you can test the person by showing him a picture of either a cat or a dog, if he answers corretly.
In no way parallel universes are involved.

In the same way you can do the same by using a computer program. To write such a program by making no reference what the pictures show is very difficult.

Researchers have discovered a number of similar 'undecidable' problems, says Peter O'Heran a computer scientist at etc.
You need the details of these problem to have any fruitful discussion.
In particular, following work by Gödel, Alan Turing found a class of questions that no computer program can be guaranteed to answer in any finite number of steps.
How do you know that this program is correct? You can only test such a program if you have an almost similar problem that based on certain parameters either can not be solved or can be solved.
A computer program, which simulates this problem, should give the same results.
Using this program, as a base, implementing certain modification, you can then try to simulate the original problem (which has no solution).


Reflection 1 - Pavlov experiment

The classical pavlov experiment consists of a dog and some food. In the experiment the salvation process is measured.
In this particular case the dog is shown different images one of which is a square. In front of the dog there are two pushbuttons. When the dog pushes the correct button food drops in front of him. In this case when there is a square the dog should press the left button.
In the next case a circle is shown. The dog should press the right button.
In the next cases there are two possiblities: In the final case the image is neither a circle nor a square. The dog does not know what to select and becomes mad.


Reflection 2 - Image selection experiment

The image selection experiment resembles very much the pavlov experiment. The dog is replaced by a computer program (an algorithm).

To write a general purpose program which does not contain any specific data about the images shown is very difficult. With general purpose I mean that the same program can also be used if the images are "an apple or a pear" or "a man or a woman" or what ever you want. That is difficult.



Reflection 3 - Machine learning

In Reflection 1 a dog is used to select. The result of the experiment is that in some cases the dog can not select.
In Reflection 2 a general purpose program is used to select. The main issue of the program is that the program should be heavily trained in order to distinquish the different images.
For the selection part the final images should detailed enough (show enough data) to make the correct selection otherwise the program can not operate properly.

We humans have the capability, based on our experience in the past, to identify a hugh number of objects. The question is to what extend it is possible to write a general purpose program that does the same. A special purpose is a program which contains a lot of extra instructions, which are part of the program, which makes this decision process simpler.


If you want to give a comment you can use the following form Comment form


Created: 25 January 2019

Back to my home page Index
Back to Nature comments Nature Index